Reproducible research
with Nix and rix

Short intro to the topic



Bruno Rodrigues
Intro: Jürgen Schneider

16 May 2024

Housekeeping

Please check:

rix installed

Nix installed (if not, we’ll do an alternative using GitHub Actions)


Alternative to installation of Nix
Do you have git installed?

Do you have a GitHub account?


Workshop will be recorded
Only the presenter and the shared screen are recorded

If you want to be absolutely sure, switch off your camera and rename yourself if necessary

The DIPF Open Science Codex





Event series
“Open Science - Opportunities for Educational Research”


Organized
by the DIPF working group “Open Research and Practice”


Further events
see leibniz-openscience.de/event-calendar

The DIPF Open Science Codex





DIPF Open Science Codex est. January 23rd 2024

The DIPF Open Science Codex

(B. Nosek, 2019)


Reproducibility: The challenge



Imagine

  • For a conference, you want to run new analyses on subgroups of your sample
  • A reviewer of your journal article
    • suggests further (exploratory) analyses
    • asks you to share your data and code, so that s/he can check your analyses
  • For a meta-analysis, other researchers want to calculate the pre-post-correlation of your test

But when you/others re-run your code, R throws error messages.

Reproducibility: The challenge



Crüwell et al. (2023): All articles from one issue in Psychological Science

Hardwicke et al. (2021): All articles with open data badge from Psychological Science in 2014-2015

Obels et al. (2020): 36 registered reports that shared both code and data


Computational reproducibility

“obtaining consistent results using the same input data; computational steps, methods, and code; and conditions of analysis” (NAS, 2018, p. 46)


“In principle, all reported evidence should be reproducible. If someone applies the same analysis to the same data, the same result should occur.” (B. A. Nosek et al., 2022, p. 721)

Reproducibility: The challenge



Artner et al. (2021): 232 scientific claims from 46 journal articles


Reasons for lack of reproducibility





  • Renaming files
  • Hard-coding file paths
  • Copy-paste errors
  • Wrong rounding
  • Old package versions
  • Non-standardized computational environment (e.g., older software versions)

(Batinovic & Carlsson, 2023)

Package management: {renv}


  • Ensures that everyone involved uses the same package versions
  • Creates project library within the project (as opposed to global library)


Very simple usage:

  • Start using renv in the current project with renv::init()
  • Install the needed packages as usual
  • Document which packages (and versions) are needed with renv::snapshot()
  • Collaborators recreate this environment in their project with renv::restore()
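The workflow above can be sketched as an R session (a minimal sketch; the example packages are illustrative, the renv functions are its documented entry points):

```r
# Start using renv in the current project:
# creates a project-local library and lockfile infrastructure
renv::init()

# Install the needed packages as usual (illustrative examples)
install.packages(c("dplyr", "ggplot2"))

# Record the exact package versions in renv.lock
renv::snapshot()

# On a collaborator's machine: recreate the recorded environment
renv::restore()
```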




Alternative: {groundhog} package

Package management: {renv}

package versions comp. environment editable execution
package management if...

Containers: {holepunch}




Like a lightweight virtual machine
A small self-sufficient environment for a software application, so it runs consistently regardless of where it’s deployed

Containers: {holepunch}





  • R package holepunch containerizes your R project for you
  • AND provides a link for others to access it via their browser
  • With only a few lines of code
library(holepunch)

# Describe the research compendium (creates a DESCRIPTION file)
write_compendium_description(package = "Your compendium name", 
                             description = "Your compendium description")

# Generate a Dockerfile capturing the project's environment
write_dockerfile(maintainer = "your_name") 

# Add a "launch binder" badge to the README
generate_badge()

# Build the project on mybinder.org so others can run it in the browser
build_binder()

Containers: {holepunch}


Containers: {holepunch}

package versions comp. environment editable execution
package management if...
containerization if...

Nix

package versions comp. environment editable execution
package management if...
containerization if...
Nix (my hope ->)
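As a preview, rix generates a default.nix that pins the R version and package versions through Nix. A hedged sketch (argument names follow the rix documentation as I understand it; the R version and packages are illustrative):

```r
library(rix)

# Write a default.nix pinning R 4.3.1 and two example packages
rix(r_ver = "4.3.1",
    r_pkgs = c("dplyr", "ggplot2"),
    ide = "other",
    project_path = ".",
    overwrite = TRUE)
```

Running nix-build and then nix-shell in the project folder should then drop you into that pinned environment.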

Recording

Starting now.

Thank you



Jürgen Schneider

References

Artner, R., Verliefde, T., Steegen, S., Gomes, S., Traets, F., Tuerlinckx, F., & Vanpaemel, W. (2021). The reproducibility of statistical results in psychological research: An investigation using unpublished raw data. Psychological Methods, 26(5), 527–546. https://doi.org/10.1037/met0000365
Batinovic, L., & Carlsson, R. (2023, March). Why your code doesn’t reproduce: Lessons learned from Meta-Psychology. Unconference 2023: Open Scholarship Practices in Education Research.
Clyburne-Sherin, A., Fei, X., & Green, S. A. (2019). Computational Reproducibility via Containers in Psychology. Meta-Psychology, 3. https://doi.org/10.15626/MP.2018.892
Crüwell, S., Apthorp, D., Baker, B. J., Colling, L., Elson, M., Geiger, S. J., Lobentanzer, S., Monéger, J., Patterson, A., Schwarzkopf, D. S., Zaneva, M., & Brown, N. J. L. (2023). What’s in a Badge? A Computational Reproducibility Investigation of the Open Data Badge Policy in One Issue of Psychological Science. Psychological Science, 34(4), 512–522. https://doi.org/10.1177/09567976221140828
Hardwicke, T. E., Bohn, M., MacDonald, K., Hembacher, E., Nuijten, M. B., Peloquin, B. N., deMayo, B. E., Long, B., Yoon, E. J., & Frank, M. C. (2021). Analytic reproducibility in articles receiving open data badges at the journal Psychological Science : An observational study. Royal Society Open Science, 8(1), 201494. https://doi.org/10.1098/rsos.201494
Liu, D. M., & Salganik, M. J. (2019). Successes and Struggles with Computational Reproducibility: Lessons from the Fragile Families Challenge. Socius: Sociological Research for a Dynamic World, 5, 237802311984980. https://doi.org/10.1177/2378023119849803
NAS. (2018). Open Science by Design: Realizing a Vision for 21st Century Research (p. 25116). National Academies Press. https://doi.org/10.17226/25116
Nosek, B. (2019). Strategy for Culture Change. Center for Open Science.
Nosek, B. A., Hardwicke, T. E., Moshontz, H., Allard, A., Corker, K. S., Dreber, A., Fidler, F., Hilgard, J., Kline Struhl, M., Nuijten, M. B., Rohrer, J. M., Romero, F., Scheel, A. M., Scherer, L. D., Schönbrodt, F. D., & Vazire, S. (2022). Replicability, Robustness, and Reproducibility in Psychological Science. Annual Review of Psychology, 73(1), 719–748. https://doi.org/10.1146/annurev-psych-020821-114157
Obels, P., Lakens, D., Coles, N. A., Gottfried, J., & Green, S. A. (2020). Analysis of Open Data and Computational Reproducibility in Registered Reports in Psychology. Advances in Methods and Practices in Psychological Science, 3(2), 229–237. https://doi.org/10.1177/2515245920918872

Credit

Title page

Icons by Font Awesome CC BY 4.0

Input-Output-Documents


Built from Quarto or RMarkdown


Integrates

  • formatted text
  • media (pictures/videos/…)
  • R code

executes R code
displays input alongside output

Example rendered to HTML or other outputs.
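A minimal R Markdown source illustrating the idea (the title and chunk content are illustrative):

````markdown
---
title: "Example analysis"
output: html_document
---

Mean sepal length in the built-in iris data:

```{r}
mean(iris$Sepal.Length)
```
````

Rendering this file executes the R chunk and displays the code alongside its output in the resulting HTML.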

Input-Output-Documents

package versions comp. environment transparent editable execution
Input-Output-Documents